Abstract

This report was written for the seminar Computational Logic 3 conducted at
the Institute of Computer Science of the University of Innsbruck. It is
concerned with a SAT solver approach for deciding LPO-termination of term
rewrite systems and is a continuation of the seminar report [13], which
considers a BDD approach. After the relevant algorithms are explained,
experimental results are reported.
1 Introduction

Termination is one of the most important properties of term rewrite systems
(TRSs). In general it is undecidable, but for certain classes of TRSs powerful
methods have been developed to decide termination. One of these methods is
the lexicographic path ordering (LPO). This seminar report discusses an
algorithm for deciding LPO-termination suggested by Codish et al. in [2] as an
alternative to the one given in [6]. Implementations of both encodings are
compared to the standard TTT implementation [10] on a database of 773 TRS
instances. The main idea of the approaches in [2] and [6] is to extract the
constraints LPO puts on a precedence for a given TRS into a propositional
formula. Afterwards some more constraints are added to ensure the properties
of a precedence and finally the propositional formula is tested for
satisfiability. The two approaches differ in the way the additional constraints
are expressed and in how the result is tested for satisfiability: [6] uses
binary decision diagrams (BDDs) whereas [2] integrates the SAT solver
MiniSat [3].

In Section 2 some simple definitions are fixed and results mentioned.
Section 3 describes how to get the constraints for LPO-termination and explains
how the additional constraints are constructed. After a short discussion about
optimisations in Section 4 the run time results are presented in Section 5.
Some remarks about the paper [2] can be found in Section 6. The report is
concluded with ideas for future work which are mentioned in Section 7.
Appendix A contains a proof that the version of LPO defined here is indeed a
simplification order and Appendix B shows how C++ can be interfaced with OCaml.

2 Preliminaries 
2.1 Relations 

Definition 1. A quasi-order is a reflexive and transitive relation. A proper
order is an irreflexive and transitive relation. An equivalence relation is
reflexive, symmetric, and transitive.

Lemma 1. Let $\succsim$ be a quasi-order on a set $A$. Then
$\{a \succ b \mid a \succsim b \text{ and not } b \succsim a\}$ is a proper
order and $\{a \sim b \mid a \succsim b \text{ and } b \succsim a\}$ is an
equivalence relation on $A$. Furthermore ${\succsim} = {\succ} \uplus {\sim}$
where $\uplus$ denotes disjoint union. □

Remark 1. When talking about quasi-orders it is sometimes convenient to
specify the strict and the equivalence part separately. So $f \succsim f$,
$g \succsim g$, $h \succsim h$, $f \succsim g$, $g \succsim f$, $f \succsim h$,
and $g \succsim h$ is then written as $f \sim g$, $f \succ h$, and $g \succ h$
or even as $f \sim g$ and $f \succ h$ because $g \succ h$ follows implicitly
from the transitivity of $\succsim$.

Next the lexicographic extension of a quasi-order is defined. 

Definition 2. Let $\succsim$ be a quasi-order with strict part $\succ$ and
equivalence part $\sim$. The lexicographic extension $\succ^{lex}$ is then
defined as follows:

$$(s_1,\dots,s_m) \succ^{lex} (t_1,\dots,t_n) \iff m > 0 \land \bigl(n = 0 \lor (n > 0 \land (s_1 \succ t_1 \lor (s_1 \sim t_1 \land (s_2,\dots,s_m) \succ^{lex} (t_2,\dots,t_n))))\bigr)$$

When dealing with proper orders, equivalence amounts to equality.
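
The recursion of Definition 2 can be illustrated with a few lines of Python.
This is only a sketch: the function name `lex_greater` and the idea of passing
the strict part and the equivalence part as two callables are assumptions made
for illustration and do not stem from the implementations discussed in this
report.

```python
def lex_greater(ss, ts, greater, equivalent):
    """Strict lexicographic comparison of two argument tuples (Definition 2).

    `greater` and `equivalent` are hypothetical callables standing for the
    strict part and the equivalence part of the underlying quasi-order.
    """
    if len(ss) == 0:      # m = 0: the empty tuple is never greater
        return False
    if len(ts) == 0:      # m > 0 and n = 0
        return True
    return greater(ss[0], ts[0]) or (
        equivalent(ss[0], ts[0])
        and lex_greater(ss[1:], ts[1:], greater, equivalent))


# Usage on integers with the usual order.
gt = lambda a, b: a > b
eq = lambda a, b: a == b
print(lex_greater((1, 3), (1, 2), gt, eq))
print(lex_greater((1,), (1, 2), gt, eq))
```

Note that a longer tuple beats a shorter one whenever the shorter one is used
up while all compared positions were equivalent.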
2.2 LPO 

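Let $\mathcal{F}$ be a signature and $\mathcal{V}$ a set of variables. Then
$\mathcal{T}(\mathcal{F},\mathcal{V})$ denotes the set of all terms which can
be built over $\mathcal{F}$ and $\mathcal{V}$. A term rewrite system (TRS)
$\mathcal{R} = (\mathcal{F}, R)$ consists of a signature $\mathcal{F}$ and a
set of rewrite rules
$R \subseteq \mathcal{T}(\mathcal{F},\mathcal{V}) \times \mathcal{T}(\mathcal{F},\mathcal{V})$.
We write $l \to r \in R$ instead of $(l, r) \in R$.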

The basic idea of term rewriting is to apply these rules to terms. We skip the
details of how they are applied. A general question which arises is whether
this process of applying rules stops at some time. We call this (undecidable)
property of $\mathcal{R}$ termination. For several classes of TRSs powerful
methods have been developed to determine if the given instance is terminating.
In the sequel we focus on the lexicographic path ordering, which provides a
sufficient condition for termination. (As it is not a necessary condition there
are terminating TRSs which cannot be proved terminating by LPO.)

Definition 3. A quasi-precedence $\succsim$ (strict precedence $\succ$) is a
quasi-order (proper order) on a signature $\mathcal{F}$. Sometimes we find it
convenient to call a quasi-precedence simply precedence.

Note that as an immediate consequence of Lemma 1 we can build a
quasi-precedence out of a strict one by adding reflexivity. Therefore all
results carry over. Next the induced order $\succsim_{lpo}$ for a given
precedence will be defined. To this end we split it into its strict part
($\succ_{lpo}$) and its equivalence part ($\sim_{lpo}$). First consider
$\sim_{lpo}$.

Definition 4. Let $\succsim$ be a precedence and
$s,t \in \mathcal{T}(\mathcal{F},\mathcal{V})$. We define $s \sim_{lpo} t$ if
one of the following alternatives holds:

(1) $s = t$, or

(2) $s = f(s_1,\dots,s_m)$, $t = g(t_1,\dots,t_m)$, $f \sim g$, and
    $s_i \sim_{lpo} t_i$ for all $1 \le i \le m$.

After that the strict part can be defined as follows: 

Definition 5. Let $\succsim$ be a precedence and
$s,t \in \mathcal{T}(\mathcal{F},\mathcal{V})$. If $s, t \notin \mathcal{V}$
then $s = f(s_1,\dots,s_m)$ and $t = g(t_1,\dots,t_n)$. We have
$s \succ_{lpo} t$ if one of the following alternatives holds:

(1) $f \sim g$ and either there is an $i \in \{1,\dots,m\}$ such that
    $s_j \sim_{lpo} t_j$ for all $1 \le j < i$, $s_i \succ_{lpo} t_i$, and
    $s \succ_{lpo} t_j$ for all $i < j \le n$, or $m > n$ and
    $s_i \sim_{lpo} t_i$ for all $1 \le i \le n$, or

(2) $f \succ g$ and $s \succ_{lpo} t_j$ for all $1 \le j \le n$, or

(3) there is an $i \in \{1,\dots,m\}$ such that $s_i \succsim_{lpo} t$.

When $s \succ_{lpo} t$ by clause (j) in the definition above, we write
$s \succ_{lpo} t$ (j). If convenient we add the index $i \in \{1,\dots,m\}$ in
cases (1) and (3). Be aware that (1) might refer to case (1) where the second
alternative applies or, in a more general view, to case (1) where the exact
alternative is not important. The context clarifies the exact meaning. When
considering strict precedences $\succ$ the equivalence relation $\sim_{lpo}$
amounts to syntactical identity.

Definition 6. A TRS $\mathcal{R} = (\mathcal{F}, R)$ is called
(quasi-)LPO-terminating if and only if there exists a precedence $\succsim$
such that $l \succ_{lpo} r$ for all $l \to r \in R$. In case that $\succsim$
is a proper order we speak of strict LPO-termination.

Note that as soon as the precedence $\succsim$ is fixed the induced order
$\succsim_{lpo}$ is determined uniquely. In order to show that a TRS which is
LPO-terminating is actually terminating it suffices to show that
$\succ_{lpo}$ is a simplification order. The proof can be found in Appendix A.

Example 1. Let $\mathcal{R}_1$ be the TRS consisting of the following rule:

    f(y, g(x), x) → f(y, x, g(g(x)))

We want to determine if $f(y,g(x),x) \succ_{lpo} f(y,x,g(g(x)))$. Clearly
case (1) of the definition applies and the $i$ is 2. So it remains to test that
$g(x) \succ_{lpo} x$ and $f(y,g(x),x) \succ_{lpo} g(g(x))$. The first holds by
(3). For the latter we use (2) and therefore we need $f \succ g$ in the
(strict) precedence and $f(y,g(x),x) \succ_{lpo} g(x)$ which again holds by
(3). Hence this instance is strict LPO-terminating with strict precedence
$f \succ g$ and LPO-terminating with quasi-precedence $f \succsim f$,
$g \succsim g$, and $f \succsim g$.

Let $\mathcal{R}_2$ be the TRS consisting of the following two rules:

    f(x) → g(x)
    g(x) → f(x)

Strict LPO-termination is not the case since the first rule can only be
handled by setting $f \succ g$ whereas the second one requires $g \succ f$.
But as a strict precedence $\succ$ must be transitive, $f \succ g$ and
$g \succ f$ yield $f \succ f$ which clearly contradicts irreflexivity. The
system is also not quasi-LPO-terminating because if $f \sim g$ then the first
rule does not fulfil $f(x) \succ_{lpo} g(x)$. Anyway, it would be strange if
any formalism stated that the system is terminating because it obviously is
not!

Let $\mathcal{R}_3$ be the TRS (from [2]) consisting of the following three
rules:

    div(x, e) → i(x)
    i(div(x, y)) → div(y, x)
    div(div(x, y), z) → div(y, div(i(x), z))

Again the system is not strict LPO-terminating because the first rule demands
$div \succ i$ and the second one $i \succ div$. But it is LPO-terminating with
quasi-precedence $div \sim i$.
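
The claims of Example 1 can be replayed with a small Python sketch of
Definitions 4 and 5. Everything about the representation is an assumption made
for illustration: variables are strings, a term f(t1, ..., tn) is the tuple
`("f", t1, ..., tn)`, and a (total) quasi-precedence is given by a hypothetical
dictionary `rank` from symbols to integers, where a larger integer means a
greater symbol and equal integers mean equivalent symbols.

```python
def is_var(t):
    return isinstance(t, str)

def equiv(s, t, rank):
    """s ~lpo t (Definition 4)."""
    if s == t:
        return True
    if is_var(s) or is_var(t) or len(s) != len(t):
        return False
    return rank[s[0]] == rank[t[0]] and all(
        equiv(a, b, rank) for a, b in zip(s[1:], t[1:]))

def lpo_gt(s, t, rank):
    """s >lpo t (Definition 5)."""
    if is_var(s):
        return False
    args_s = s[1:]
    # Clause (3): some argument of s is equivalent to or greater than t.
    if any(equiv(a, t, rank) or lpo_gt(a, t, rank) for a in args_s):
        return True
    if is_var(t):
        return False
    args_t = t[1:]
    if rank[s[0]] > rank[t[0]]:
        # Clause (2): s must be greater than every argument of t.
        return all(lpo_gt(s, b, rank) for b in args_t)
    if rank[s[0]] == rank[t[0]]:
        # Clause (1): lexicographic comparison of the arguments.
        for i, a in enumerate(args_s):
            if i == len(args_t):
                return True        # m > n and all t_i matched equivalently
            if lpo_gt(a, args_t[i], rank):
                return all(lpo_gt(s, b, rank) for b in args_t[i + 1:])
            if not equiv(a, args_t[i], rank):
                return False
    return False

def orients(rules, rank):
    return all(lpo_gt(l, r, rank) for l, r in rules)

R1 = [(("f", "y", ("g", "x"), "x"), ("f", "y", "x", ("g", ("g", "x"))))]
R2 = [(("f", "x"), ("g", "x")), (("g", "x"), ("f", "x"))]
R3 = [(("div", "x", ("e",)), ("i", "x")),
      (("i", ("div", "x", "y")), ("div", "y", "x")),
      (("div", ("div", "x", "y"), "z"),
       ("div", "y", ("div", ("i", "x"), "z")))]

print(orients(R1, {"f": 2, "g": 1}))             # f > g
print(orients(R3, {"div": 1, "i": 1, "e": 1}))   # div ~ i
```

The sketch only checks a given precedence; searching for a suitable precedence
is the subject of the encodings in the next section.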
3 Algorithms for LPO Termination

In this section first an LPO encoding is presented in order to extract the
demands on a precedence. Then the approach of [2] to ensure the necessary
properties of a precedence is reviewed. In this approach function symbols
are interpreted as natural numbers and then ordered by the greater or equal
relation. Thus reflexivity and transitivity are enforced automatically.



3.1 LPO Encoding 

As we have seen in the previous section we need to represent a precedence in 
order to decide LPO-termination. The following definition will take care of 
encoding a relation in propositional logic. 

Definition 7. Let $X = \{X_{fg} \mid f,g \in \mathcal{F} \text{ with } f \neq g\}$
be a set of propositional variables and $Y$ a similar set. An assignment
$\alpha$ induces relations on $\mathcal{F}$ as follows: $f \succ g$ if and
only if $\alpha(X_{fg}) = \top$ and $f \sim g$ if and only if
$\alpha(Y_{fg}) = \top$.

Equivalence of terms using the $Y_{fg}$ variables can then be encoded as
follows:

Definition 8. Let $s,t \in \mathcal{T}(\mathcal{F},\mathcal{V})$. If
$s,t \notin \mathcal{V}$ then $s = f(s_1,\dots,s_m)$ and
$t = g(t_1,\dots,t_n)$:

$$
E(Y)_{s,t} =
\begin{cases}
\top & \text{if } s = t \\
Y_{fg} \land E(Y)_{s_1,t_1} \land \dots \land E(Y)_{s_m,t_m} & \text{if } m = n \\
\bot & \text{otherwise}
\end{cases}
$$

Next we generalise the definition of the lexicographic order to terms. Note
that there is a nested recursion between $C(X,Y)_{s,t}$, which encodes LPO,
and $LEX(X,Y)_{(s_1,\dots,s_m),(t_1,\dots,t_n)}$, which takes care of
comparing the arguments lexicographically.

Definition 9.

$$
LEX(X,Y)_{(s_1,\dots,s_m),(t_1,\dots,t_n)} =
\begin{cases}
\bot & m = 0 \\
\top & m > 0 \text{ and } n = 0 \\
C(X,Y)_{s_1,t_1} \lor \bigl(E(Y)_{s_1,t_1} \land LEX(X,Y)_{(s_2,\dots,s_m),(t_2,\dots,t_n)}\bigr) & m > 0 \text{ and } n > 0
\end{cases}
$$

Now all preparation for the LPO encoding is done and $C(X,Y)_{s,t}$ in the
following definition is a propositional formula which exactly mirrors the
constraints LPO puts on the precedence in order to ensure $s \succ_{lpo} t$.
The definition below is as general as possible and can also cope with
quasi-precedences. The slight differences for strict precedences are indicated
in the boxes. The first line refers to strict precedences (note that no $Y$
variables are needed) and the second line to quasi-precedences. Consider the
third branch of Definition 10 first. The two function symbols of the terms $s$
and $t$ are different. For strict precedences we demand $f \succ g$ and we
skip the first $k-1$ equal arguments ($s \succ_{lpo} t_l$ for all
$1 \le l < k$ holds automatically if $s_l = t_l$ for all $1 \le l < k$) and
demand that $s \succ_{lpo} t_l$ for all $k \le l \le n$. In case of a
quasi-precedence we do not know if $f \succ g$ or $f \sim g$. In the former
case we do the same as for the strict case whereas in the latter we must find
a strict decrease in the arguments. Because the first $k-1$ ones are equal
they are trivially equivalent and do not yield a strict decrease. It might
happen that the $k$-th arguments are equivalent and the strict decrease is in
some later arguments. That is why the lexicographic extension is applied here.
It ensures that the arguments with the strict decrease are found. Finally
$s \succ_{lpo} t_l$ for all $k \le l \le n$ ensures that all remaining
arguments of $t$ are smaller than $s$. The fourth branch of the definition
considers equal function symbols which are trivially equivalent. For strict
precedences the $k$-th argument must be the one with the strict decrease
whereas for quasi-precedences it again might happen that
$s_k \sim_{lpo} t_k$ and so on. Note that the constraints for each branch in
the definition below make the nondeterministic definition of LPO
(Definition 5) deterministic.



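Definition 10. Let $s,t \in \mathcal{T}(\mathcal{F},\mathcal{V})$. If
$s,t \notin \mathcal{V}$ then $s = f(s_1,\dots,s_m)$ and
$t = g(t_1,\dots,t_n)$:

$$
C(X,Y)_{s,t} =
\begin{cases}
\bot & \text{if } s = t \text{ or } s \in \mathcal{V} \text{ or both } t \in \mathcal{V} \text{ and } t \notin Var(s) \\[4pt]
\top & \text{if } s \notin \mathcal{V},\ t \in Var(s) \\[4pt]
CE(X,Y)_{s,t} \lor \left( \boxed{\begin{array}{l} X_{fg} \\ X_{fg} \lor \bigl(Y_{fg} \land LEX(X,Y)_{(s_k,\dots,s_m),(t_k,\dots,t_n)}\bigr) \end{array}} \land C_k(X,Y)_{s,t} \right) & \text{if } s \neq t,\ s \notin \mathcal{V},\ t \notin \mathcal{V}, \text{ and } f \neq g \\[4pt]
CE(X,Y)_{s,t} \lor \left( \boxed{\begin{array}{l} C(X)_{s_k,t_k} \\ LEX(X,Y)_{(s_k,\dots,s_m),(t_k,\dots,t_n)} \end{array}} \land C_{k+1}(X,Y)_{s,t} \right) & \text{if } s \neq t,\ s \notin \mathcal{V},\ t \notin \mathcal{V}, \text{ and } f = g
\end{cases}
$$

with

$$CE(X,Y)_{s,t} = \bigvee_{i=1}^{m} E(Y)_{s_i,t} \lor \bigvee_{i=1}^{m} C(X,Y)_{s_i,t}$$

and

$$C_k(X,Y)_{s,t} = \bigwedge_{j=k}^{n} C(X,Y)_{s,t_j}$$

where $k$ is the minimum value of $i$ ($1 \le i \le m$) with $s_i \neq t_i$.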

According to Definition 6 it is sufficient to test $l \succ_{lpo} r$ for all
rules $l \to r \in R$ to ensure LPO-termination. The next definition expresses
the quasi-LPO as well as the strict LPO constraints not only for single rules
but for whole TRSs (setting $\emptyset$ for $Y$ suggests that no $Y$ variables
are used, i.e., they all evaluate to $\bot$).

Definition 11.

$$C(X,Y) = \bigwedge \{ C(X,Y)_{l,r} \mid l \to r \in R \} \quad\text{and}\quad C(X) = C(X,\emptyset)$$

Example 2. Let us first focus on strict LPO-termination. Therefore we compute
$C(X,\emptyset) = C(X)$ for the first two TRSs of Example 1 given in the
previous section. Note that as we are concerned with strict precedences
$E(Y)_{s,t}$ is abbreviated to $E_{s,t}$ because no $Y$ variables are allowed.
Consequently, two terms can only be equivalent if they are equal. Using the
abbreviations $s = f(y,g(x),x)$ and $t = f(y,x,g(g(x)))$ we get for
$\mathcal{R}_1$

\begin{align*}
C(X) &= C(X)_{s,t} \\
&= CE(X)_{s,t} \lor (C(X)_{g(x),x} \land C_3(X)_{s,t}) \\
&= E_{y,t} \lor E_{g(x),t} \lor E_{x,t} \lor C(X)_{y,t} \lor C(X)_{g(x),t} \lor C(X)_{x,t} \\
&\quad \lor (\top \land C(X)_{s,g(g(x))}) \\
&= \bot \lor \bot \lor \bot \lor \bot \lor (CE(X)_{g(x),t} \lor X_{gf} \land C_1(X)_{g(x),t}) \lor \bot \\
&\quad \lor C(X)_{s,g(g(x))} \\
&= (E_{x,t} \lor C(X)_{x,t} \lor X_{gf} \land C(X)_{g(x),y} \land C(X)_{g(x),x} \land C(X)_{g(x),g(g(x))}) \\
&\quad \lor C(X)_{s,g(g(x))} \\
&= (\bot \lor \bot \lor X_{gf} \land \bot \land \top \land C(X)_{g(x),g(g(x))}) \lor C(X)_{s,g(g(x))} \\
&= C(X)_{s,g(g(x))} \\
&= CE(X)_{s,g(g(x))} \lor (X_{fg} \land C_1(X)_{s,g(g(x))}) \\
&= E_{y,g(g(x))} \lor E_{g(x),g(g(x))} \lor E_{x,g(g(x))} \lor C(X)_{y,g(g(x))} \lor C(X)_{g(x),g(g(x))} \\
&\quad \lor C(X)_{x,g(g(x))} \lor (X_{fg} \land C_1(X)_{s,g(g(x))}) \\
&= \bot \lor \bot \lor \bot \lor \bot \lor C(X)_{g(x),g(g(x))} \lor \bot \lor (X_{fg} \land C_1(X)_{s,g(g(x))}) \\
&= C(X)_{g(x),g(g(x))} \lor (X_{fg} \land C(X)_{s,g(x)}) \\
&= CE(X)_{g(x),g(g(x))} \lor C(X)_{x,g(x)} \lor (X_{fg} \land C(X)_{s,g(x)}) \\
&= E_{x,g(g(x))} \lor C(X)_{x,g(g(x))} \lor \bot \lor (X_{fg} \land C(X)_{s,g(x)}) \\
&= \bot \lor \bot \lor (X_{fg} \land C(X)_{s,g(x)}) \\
&= X_{fg} \land (CE(X)_{s,g(x)} \lor (X_{fg} \land C_1(X)_{s,g(x)})) \\
&= X_{fg} \land (E_{y,g(x)} \lor E_{g(x),g(x)} \lor E_{x,g(x)} \lor C(X)_{y,g(x)} \lor C(X)_{g(x),g(x)} \\
&\quad \lor C(X)_{x,g(x)} \lor (X_{fg} \land C_1(X)_{s,g(x)})) \\
&= X_{fg} \land (\bot \lor \top \lor \bot \lor \bot \lor \bot \lor \bot \lor (X_{fg} \land C(X)_{s,x})) \\
&= X_{fg} \land \top = X_{fg}
\end{align*}

For the second instance we get

\begin{align*}
C(X) &= C(X)_{f(x),g(x)} \land C(X)_{g(x),f(x)} \\
&= (CE(X)_{f(x),g(x)} \lor (X_{fg} \land C(X)_{f(x),x})) \\
&\quad \land (CE(X)_{g(x),f(x)} \lor (X_{gf} \land C(X)_{g(x),x})) \\
&= (E_{x,g(x)} \lor C(X)_{x,g(x)} \lor (X_{fg} \land \top)) \\
&\quad \land (E_{x,f(x)} \lor C(X)_{x,f(x)} \lor (X_{gf} \land \top)) = X_{fg} \land X_{gf}
\end{align*}

The resulting constraints can be satisfied but they do not correspond to a
strict precedence because any satisfying assignment would yield $f \succ g$
and $g \succ f$. A strict precedence is irreflexive and transitive but the
relation above contradicts irreflexivity if it is closed under transitivity.
The interested reader is asked to verify that $C(X,Y) = X_{fg} \land X_{gf}$
and therefore no quasi-precedence exists which shows LPO-termination of this
instance. The third TRS (now computed with $Y$ variables and some details
omitted) yields

\begin{align*}
C(X,Y)_{div(x,e),i(x)} &= CE(X,Y)_{div(x,e),i(x)} \\
&\quad \lor (X_{div,i} \lor (Y_{div,i} \land LEX(X,Y)_{(e),()})) \land C(X,Y)_{div(x,e),x} \\
&= \bot \lor (X_{div,i} \lor (Y_{div,i} \land \top)) \land \top \\
&= X_{div,i} \lor Y_{div,i} \\[6pt]
C(X,Y)_{i(div(x,y)),div(y,x)} &= CE(X,Y)_{i(div(x,y)),div(y,x)} \\
&\quad \lor (X_{i,div} \lor (Y_{i,div} \land LEX(X,Y)_{(div(x,y)),(y,x)})) \\
&\quad \land C(X,Y)_{i(div(x,y)),y} \land C(X,Y)_{i(div(x,y)),x} \\
&= \bot \lor X_{i,div} \lor (Y_{i,div} \land \top) \land \top \land \top = X_{i,div} \lor Y_{i,div} \\[6pt]
C(X,Y)_{div(div(x,y),z),div(y,div(i(x),z))} &= CE(X,Y)_{div(div(x,y),z),div(y,div(i(x),z))} \lor LEX(X,Y)_{(div(x,y),z),(y,div(i(x),z))} \\
&\quad \land C(X,Y)_{div(div(x,y),z),div(i(x),z)} \\
&= \bot \lor \top \land C(X,Y)_{div(div(x,y),z),div(i(x),z)} \\
&= CE(X,Y)_{div(div(x,y),z),div(i(x),z)} \lor LEX(X,Y)_{(div(x,y),z),(i(x),z)} \\
&\quad \land C(X,Y)_{div(div(x,y),z),z} \\
&= \bot \lor C(X,Y)_{div(x,y),i(x)} \land \top \\
&= CE(X,Y)_{div(x,y),i(x)} \lor (X_{div,i} \lor (Y_{div,i} \land LEX(X,Y)_{(y),()})) \\
&\quad \land C(X,Y)_{div(x,y),x} \\
&= \bot \lor (X_{div,i} \lor Y_{div,i} \land \top) \land \top \\
&= X_{div,i} \lor Y_{div,i}
\end{align*}

The conjunction of the three constraints above amounts to

$$C(X,Y) = (X_{div,i} \lor Y_{div,i}) \land (X_{i,div} \lor Y_{i,div})$$

and therefore the TRS admits the quasi-precedence $div \sim i$.
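
The strict part of the computation above can be reproduced mechanically. The
following Python sketch covers only the strict case $C(X)$ (no $Y$ variables)
and rests on several assumptions that are not part of the report: variables
are strings, a term f(t1, ..., tn) is the tuple `("f", t1, ..., tn)`, every
symbol has a fixed arity, constants $\top$ and $\bot$ are simplified away on
the fly, and for $f \neq g$ the conjunction ranges over all arguments of $t$
(the skipped arguments only contribute $\top$). Satisfiability is tested by
naive enumeration of integer rankings instead of calling a SAT solver, so the
sketch is suitable for tiny signatures only.

```python
from itertools import product

def is_var(t):
    return isinstance(t, str)

def variables(t):
    return {t} if is_var(t) else set().union(*map(variables, t[1:]))

def conj(a, b):          # conjunction with constants simplified away
    if a is False or b is False:
        return False
    return b if a is True else a if b is True else ("and", a, b)

def disj(a, b):          # disjunction with constants simplified away
    if a is True or b is True:
        return True
    return b if a is False else a if b is False else ("or", a, b)

def fold(op, unit, items):
    for item in items:
        unit = op(unit, item)
    return unit

def C(s, t):
    """Strict constraint C(X)_{s,t}; ("X", f, g) stands for X_fg."""
    if s == t or is_var(s) or (is_var(t) and t not in variables(s)):
        return False
    if is_var(t):
        return True
    args_s, args_t = s[1:], t[1:]
    ce = fold(disj, False, [(a == t) or C(a, t) for a in args_s])
    if s[0] != t[0]:
        rest = fold(conj, True, [C(s, b) for b in args_t])
        return disj(ce, conj(("X", s[0], t[0]), rest))
    k = next(i for i, (a, b) in enumerate(zip(args_s, args_t)) if a != b)
    rest = fold(conj, True, [C(s, b) for b in args_t[k + 1:]])
    return disj(ce, conj(C(args_s[k], args_t[k]), rest))

def encode(rules):
    return fold(conj, True, [C(l, r) for l, r in rules])

def holds(formula, rank):
    if isinstance(formula, bool):
        return formula
    tag, a, b = formula
    if tag == "X":
        return rank[a] > rank[b]
    if tag == "and":
        return holds(a, rank) and holds(b, rank)
    return holds(a, rank) or holds(b, rank)

def find_precedence(formula, symbols):
    """Try every ranking of the symbols into {1, ..., n}."""
    for values in product(range(1, len(symbols) + 1), repeat=len(symbols)):
        rank = dict(zip(symbols, values))
        if holds(formula, rank):
            return rank
    return None

R1 = [(("f", "y", ("g", "x"), "x"), ("f", "y", "x", ("g", ("g", "x"))))]
R2 = [(("f", "x"), ("g", "x")), (("g", "x"), ("f", "x"))]
print(encode(R1), find_precedence(encode(R1), ["f", "g"]))
print(encode(R2), find_precedence(encode(R2), ["f", "g"]))
```

Interpreting each X variable through an integer ranking already anticipates
the symbol based encoding of the next subsection.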
3.2 A Symbol Based Encoding 

After computing $C(X,Y)$ one can infer a precedence from these constraints
(if the necessary properties of a precedence can be satisfied). Remember that
for a quasi-precedence reflexivity and transitivity have to be ensured. The
idea proposed in [2] is to assign to every function symbol a positive integer
value. The greater or equal relation ($\ge$) on natural numbers then ensures
that the function symbols are quasi-ordered. In fact the order is even total.
Let $|\mathcal{F}| = n$. Then we are looking for a mapping
$m : \mathcal{F} \to \{1,\dots,n\}$ such that every propositional variable
$X_{fg} \in X$ expresses $m(f) > m(g)$ and every $Y_{fg} \in Y$ expresses
$m(f) = m(g)$. In order to encode these constraints in propositional logic
integers are represented in binary notation. To uniquely encode $n$ function
symbols, $k := \lceil ld(n) \rceil$ bits are needed. Here $ld()$ denotes the
logarithm to base 2. The next definition shows how the constraints for the $X$
(strict part) and $Y$ (equivalence part) variables can be formalised. The
$k$-bit representation of $f$ is $(f_k,\dots,f_1)$ with $f_k$ the most
significant bit.

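Definition 12.

$$
\|X_{fg}\|_k =
\begin{cases}
(f_1 \land \lnot g_1) & k = 1 \\
(f_k \land \lnot g_k) \lor ((f_k \leftrightarrow g_k) \land \|X_{fg}\|_{k-1}) & k > 1
\end{cases}
$$

$$\|Y_{fg}\|_k = \bigwedge_{i=1}^{k} (f_i \leftrightarrow g_i)$$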
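
That the bitwise formulas really express "greater than" and "equal" on the
encoded numbers can be checked exhaustively for small $k$. The Python sketch
below evaluates both formulas of Definition 12 on concrete bit vectors; the
helper names `bits`, `x_constraint` and `y_constraint` are made up for this
illustration.

```python
def bits(value, k):
    """Return [b_1, ..., b_k] with b_1 the least significant bit."""
    return [bool((value >> i) & 1) for i in range(k)]

def x_constraint(f, g, k):
    """Evaluate ||X_fg||_k on the bit vectors f and g."""
    fk, gk = f[k - 1], g[k - 1]
    if k == 1:
        return fk and not gk
    return (fk and not gk) or ((fk == gk) and x_constraint(f, g, k - 1))

def y_constraint(f, g, k):
    """Evaluate ||Y_fg||_k on the bit vectors f and g."""
    return all(f[i] == g[i] for i in range(k))

# Exhaustive comparison with the integer order for 3-bit numbers.
k = 3
agree = all(
    x_constraint(bits(a, k), bits(b, k), k) == (a > b)
    and y_constraint(bits(a, k), bits(b, k), k) == (a == b)
    for a in range(2 ** k) for b in range(2 ** k))
print(agree)
```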

After this step it is rather easy to define the whole propositional formula
which is satisfiable if and only if the given TRS is (strict) LPO-terminating.

Definition 13.

$$B(X,Y) = C(X,Y) \land \bigwedge_{z \in X \cup Y} (z \leftrightarrow \|z\|_k)$$

$$B(X) = C(X) \land \bigwedge_{z \in X} (z \leftrightarrow \|z\|_k)$$

Lemma 2. $B(X,Y)$ ($B(X)$) is satisfiable if and only if the TRS
$\mathcal{R}$ is (strict) LPO-terminating. □

Note that this encoding introduces new variables (i.e., after constructing
$C(X,Y)$ additional variables are needed to enforce symmetry and
transitivity). Let again $|\mathcal{F}| = n$. Then for every function symbol
$k := \lceil ld(n) \rceil$ additional variables are added, which makes a total
of $O(n \times ld(n))$ new variables. Nevertheless the problematic part arises
from $C(X,Y)$ which typically has $O(n^2)$ variables. But as this approach
adds a conjunct for each propositional variable appearing in $C(X,Y)$ it thus
adds $O(n^2)$ conjuncts of size $O(ld(n))$ (cf. Definition 12). Run time
results show that these additional variables do not pose a problem for MiniSat
(Section 5.1.3).

This section is concluded with a note on the symbol based encoding. Different
satisfying assignments do not necessarily give rise to different total orders
on the function symbols. Consider the following example:

Example 3. Let $C(X,Y) = X_{fg} \land Y_{gh}$. Then the three mappings
$m_1, m_2, m_3 : \mathcal{F} \to \{1,2,3\}$ with $m_1(f) = 2$,
$m_1(g) = m_1(h) = 1$, $m_2(f) = 3$, $m_2(g) = m_2(h) = 1$, and $m_3(f) = 3$,
$m_3(g) = m_3(h) = 2$ yield the same precedence $f \succ g$, $f \succ h$, and
$g \sim h$.
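
The observation of Example 3 is easy to confirm by computing the precedence
induced by each mapping. In this Python sketch the function `precedence` is a
hypothetical helper that returns the strict part and the (non-reflexive)
equivalence part as sets of symbol pairs.

```python
def precedence(m):
    """Return the strict and the equivalence part induced by a mapping m."""
    strict = {(f, g) for f in m for g in m if m[f] > m[g]}
    equivalent = {(f, g) for f in m for g in m if f != g and m[f] == m[g]}
    return strict, equivalent

m1 = {"f": 2, "g": 1, "h": 1}
m2 = {"f": 3, "g": 1, "h": 1}
m3 = {"f": 3, "g": 2, "h": 2}

print(precedence(m1) == precedence(m2) == precedence(m3))
```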



    $\varphi \land \top \to \varphi$        $\top \land \varphi \to \varphi$
    $\varphi \land \bot \to \bot$           $\bot \land \varphi \to \bot$
    $\varphi \lor \bot \to \varphi$         $\bot \lor \varphi \to \varphi$
    $\varphi \lor \top \to \top$            $\top \lor \varphi \to \top$

Table 1: A bunch of simplifications.



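    $\varphi \land (\varphi \lor \psi) \to \varphi$      $\varphi \lor (\varphi \land \psi) \to \varphi$
    $\varphi \land (\psi \lor \varphi) \to \varphi$      $\varphi \lor (\psi \land \varphi) \to \varphi$
    $(\varphi \lor \psi) \land \varphi \to \varphi$      $(\varphi \land \psi) \lor \varphi \to \varphi$
    $(\psi \lor \varphi) \land \varphi \to \varphi$      $(\psi \land \varphi) \lor \varphi \to \varphi$
    $\varphi \land \varphi \to \varphi$                  $\varphi \lor \varphi \to \varphi$

Table 2: Some more simplifications.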



4 Optimisations 

4.1 Modifications 

In addition to the constraint encoding in [2] some slight modifications have
been implemented. The propositional formula resulting from the LPO encoding
typically contains many occurrences of $\bot$ and $\top$. What happens when
the simplifications of Table 1 are performed before testing for
satisfiability? Is it even better if some more simplifications like the ones
in Table 2 are added? Although not explicitly mentioned, the implementation of
[2] incorporates the equivalences of Table 1. The simplifications of Tables 1
and 2 have been integrated in our implementation. Run time results show that
the first bunch of simplifications should really be employed whereas the
second one does not give an enormous speedup. The reason why the
simplifications of Table 1 are so effective is that they usually reduce the
number of propositional variables. For example
$X_{fg} \land X_{gh} \land X_{hf} \land \bot$ is reduced to $\bot$. The author
of this report is convinced that some further simplifications would be useful
in order to reduce the number of subformulas, which is relevant for the
transformation to CNF (cf. Definition 14).
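
A bottom-up pass applying the rules of Table 1 might look as follows. The
sketch is written in Python purely for illustration and is not taken from the
implementation described here; formulas are assumed to be the Python constants
`True` and `False`, variable names (strings), or tuples `("and", a, b)` and
`("or", a, b)`.

```python
def simplify(formula):
    """Apply the constant rules of Table 1 bottom-up."""
    if not isinstance(formula, tuple):
        return formula                    # True, False or a variable name
    op = formula[0]
    left, right = simplify(formula[1]), simplify(formula[2])
    absorbing = (op == "or")              # T absorbs "or", F absorbs "and"
    neutral = not absorbing               # F is neutral for "or", T for "and"
    if left is absorbing or right is absorbing:
        return absorbing
    if left is neutral:
        return right
    if right is neutral:
        return left
    return (op, left, right)

# The example from the text: a conjunction containing a false conjunct.
example = ("and", ("and", ("and", "Xfg", "Xgh"), "Xhf"), False)
print(simplify(example))
```

After this pass a formula is either a constant or free of constants, which is
why whole groups of variables can disappear.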
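    (1)  $\varphi \land \top \to \varphi$        $\varphi \lor \top \to \top$
         $\top \land \varphi \to \varphi$        $\top \lor \varphi \to \top$
         $\varphi \land \bot \to \bot$           $\varphi \lor \bot \to \varphi$
         $\bot \land \varphi \to \bot$           $\bot \lor \varphi \to \varphi$

    (2)  $\varphi \to \psi \;\longrightarrow\; \lnot\varphi \lor \psi$
         $\varphi \leftrightarrow \psi \;\longrightarrow\; \varphi \land \psi \lor \lnot\varphi \land \lnot\psi$

    (3)  $\lnot(\varphi \land \psi) \;\longrightarrow\; \lnot\varphi \lor \lnot\psi$
         $\lnot(\varphi \lor \psi) \;\longrightarrow\; \lnot\varphi \land \lnot\psi$

    (4)  $\lnot\lnot\varphi \;\longrightarrow\; \varphi$

    (5)  $\varphi \lor (\psi \land \chi) \;\longrightarrow\; (\varphi \lor \psi) \land (\varphi \lor \chi)$
         $(\varphi \land \psi) \lor \chi \;\longrightarrow\; (\varphi \lor \chi) \land (\psi \lor \chi)$

Table 3: The standard transformation into CNF.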



4.2 Constructing CNFs Efficiently 

SAT solvers typically expect their input in conjunctive normal form but for
the majority of the TRSs the formula $B(X,Y)$ is too large for the standard
translation which consists of the five steps depicted in Table 3. The problem
there is that the resulting CNF may be exponentially larger than the input
formula because the two rules of step (5) duplicate one of the subformulas. In
[11] Tseitin proposed a transformation which is linear in the size of the
input formula. The price for linearity is paid with introducing new variables.
As a consequence, Tseitin's transformation does not produce an equivalent
formula. E.g., a tautology may no longer be a tautology because of the new
variables. But it preserves and reflects satisfiability. The basic idea of
this transformation is simple: In order to transform the formula $\varphi$
introduce for every non-atomic subformula $\psi$ a new variable $p_\psi$.
Atoms $\psi$ are identified with $p_\psi$. The translation of the formula is
presented in the definition below. $NASub(\varphi)$ denotes all non-atomic
subformulas of $\varphi$ and $*$ represents all binary connectives.

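Definition 14.

$$Tseitin(\varphi) = p_\varphi \land \bigwedge_{\substack{\psi \in NASub(\varphi) \\ \psi = \psi_1 * \psi_2}} (p_\psi \leftrightarrow (p_{\psi_1} * p_{\psi_2})) \land \bigwedge_{\substack{\psi \in NASub(\varphi) \\ \psi = \lnot\psi_1}} (p_\psi \leftrightarrow \lnot p_{\psi_1})$$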

The attentive reader may be puzzled because the definition above does not
produce a CNF. However, each of the conjuncts above can be represented in
CNF using at most four clauses (cf. Table 4). This section is concluded with a
simple example for Tseitin's transformation.



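    $\chi \leftrightarrow (\varphi \land \psi)$   $\longrightarrow$  $(\lnot\varphi \lor \lnot\psi \lor \chi) \land (\varphi \lor \lnot\chi) \land (\psi \lor \lnot\chi)$
    $\chi \leftrightarrow (\varphi \lor \psi)$    $\longrightarrow$  $(\varphi \lor \psi \lor \lnot\chi) \land (\lnot\varphi \lor \chi) \land (\lnot\psi \lor \chi)$
    $\chi \leftrightarrow (\varphi \to \psi)$     $\longrightarrow$  $(\lnot\varphi \lor \psi \lor \lnot\chi) \land (\varphi \lor \chi) \land (\lnot\psi \lor \chi)$
    $\chi \leftrightarrow (\varphi \leftrightarrow \psi)$  $\longrightarrow$  $(\varphi \lor \psi \lor \chi) \land (\lnot\varphi \lor \lnot\psi \lor \chi)$
                                                  $\land\, (\varphi \lor \lnot\psi \lor \lnot\chi) \land (\lnot\varphi \lor \psi \lor \lnot\chi)$
    $\chi \leftrightarrow (\lnot\varphi)$         $\longrightarrow$  $(\varphi \lor \chi) \land (\lnot\varphi \lor \lnot\chi)$

Table 4: Some equivalences.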
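Example 4. Let $\varphi = (q \land \lnot r) \lor s$. Then

$$NASub(\varphi) = \{\varphi, \psi_1, \psi_2\} \quad\text{with}\quad \psi_1 = q \land \lnot r \text{ and } \psi_2 = \lnot r$$

and

$$Tseitin(\varphi) = p_\varphi \land (p_\varphi \leftrightarrow (p_{\psi_1} \lor s)) \land (p_{\psi_1} \leftrightarrow (q \land p_{\psi_2})) \land (p_{\psi_2} \leftrightarrow \lnot r).$$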
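
To make the construction concrete, the following Python sketch generates the
clauses of Definition 14 for formulas over "not", "and" and "or", using the
clause patterns of Table 4. It is an illustration under assumptions, not the
implementation evaluated in this report: atoms are strings, compound formulas
are tuples such as `("or", a, b)`, fresh variables get the hypothetical names
`_p1`, `_p2`, ..., a literal is a pair `(name, sign)`, and satisfiability is
tested by brute force rather than by a SAT solver.

```python
from itertools import product

def tseitin(formula):
    """Return a clause list that is equisatisfiable with `formula`."""
    clauses, name = [], {}

    def var(sub):
        if isinstance(sub, str):
            return sub                          # atoms name themselves
        if sub not in name:
            p = name[sub] = f"_p{len(name) + 1}"
            if sub[0] == "not":
                a = var(sub[1])
                clauses.extend([[(a, True), (p, True)],
                                [(a, False), (p, False)]])
            else:
                a, b = var(sub[1]), var(sub[2])
                if sub[0] == "and":
                    clauses.extend([[(a, False), (b, False), (p, True)],
                                    [(a, True), (p, False)],
                                    [(b, True), (p, False)]])
                else:                           # "or"
                    clauses.extend([[(a, True), (b, True), (p, False)],
                                    [(a, False), (p, True)],
                                    [(b, False), (p, True)]])
        return name[sub]

    root = var(formula)
    return [[(root, True)]] + clauses

def satisfiable(clauses):
    """Brute-force satisfiability test; exponential, for tiny inputs only."""
    atoms = sorted({v for clause in clauses for v, _ in clause})
    for values in product([False, True], repeat=len(atoms)):
        alpha = dict(zip(atoms, values))
        if all(any(alpha[v] == sign for v, sign in clause)
               for clause in clauses):
            return True
    return False

# The formula of Example 4: (q and not r) or s.
cnf = tseitin(("or", ("and", "q", ("not", "r")), "s"))
print(len(cnf), satisfiable(cnf))
```

Each non-atomic subformula contributes at most three clauses here because
only the first, second and last pattern of Table 4 are needed.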
4.2.1 Some Remarks 

In order to get a shorter representation one can try to replace
$\leftrightarrow$ in Definition 14 by $\to$. This is a bit dangerous as the
following example demonstrates.

Example 5. Let $\varphi$ be the propositional formula
$\lnot(p \lor \lnot p)$ and therefore
$NASub(\varphi) = \{\varphi, \psi_1, \psi_2\}$ with
$\psi_1 = p \lor \lnot p$ and $\psi_2 = \lnot p$.

The transformed formula is satisfiable although the original formula
$\lnot(p \lor \lnot p)$ is unsatisfiable. Below an example satisfying
assignment is provided.

$$Tseitin'(\varphi) = p_\varphi \land (p_\varphi \to \lnot p_{\psi_1}) \land (p_{\psi_1} \to p \lor p_{\psi_2}) \land (p_{\psi_2} \to \lnot p)$$

The assignment $p_\varphi = \top$ and $p_{\psi_1} = p = p_{\psi_2} = \bot$
satisfies every conjunct.
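
The counterexample can be double-checked by plain evaluation. In the Python
sketch below the function `weakened` spells out the implication-only formula
of Example 5; all names are chosen for this illustration only.

```python
def implies(a, b):
    return (not a) or b

def weakened(p_phi, p_psi1, p_psi2, p):
    """The implication-only transformation of not (p or not p)."""
    return (p_phi
            and implies(p_phi, not p_psi1)
            and implies(p_psi1, p or p_psi2)
            and implies(p_psi2, not p))

# The assignment from Example 5 satisfies the transformed formula ...
print(weakened(True, False, False, False))
# ... although the original formula is false for both values of p.
print(any(not (p or not p) for p in (False, True)))
```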

In the sequel it will become clear that this problem is caused by negations
not applied to atoms. Therefore we restrict ourselves to negation normal form
(NNF), i.e., negations are only allowed in front of atoms. Steps (1) - (4) of
Table 3 are unproblematic as far as complexity is concerned and transform
the input into NNF. Afterwards the changed transformation comes into action.
It introduces a new variable for each non-literal subformula instead of every
non-atomic subformula. $NLSub(\varphi)$ denotes all non-literal subformulas of
$\varphi$.



Definition 15.

$$Tseitin'(\varphi) = p_\varphi \land \bigwedge_{\substack{\psi \in NLSub(\varphi) \\ \psi = \psi_1 * \psi_2}} (p_\psi \to (p_{\psi_1} * p_{\psi_2}))$$

Valid precedences correspond to satisfying assignments of the formula.
Therefore it is important not to lose a precedence and, moreover, to get valid
ones only. The former is to say that every satisfying assignment of $\varphi$
can be extended (note that there are additional variables) to a satisfying
assignment of $Tseitin'(\varphi)$. The latter expresses that every satisfying
assignment for the transformed formula should also satisfy the original one.

Lemma 3. Let $\alpha$ be an assignment with $\alpha(\varphi) = \top$. Then
$\alpha$ can be extended to some $\alpha'$ such that
$\alpha'(Tseitin'(\varphi)) = \top$.

Proof. As the result holds for the original transformation the same assignment
also satisfies $Tseitin'(\varphi)$ because its formulation is weaker. □

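Lemma 4. Let $\alpha$ be an assignment. If $\alpha(Tseitin'(\varphi)) = \top$
then $\alpha(\varphi) = \top$.

Proof. The proof is done by induction on the number of non-literal
subformulas. As an abbreviation define:

$$
conj(\varphi) =
\begin{cases}
\varphi & \text{if } \varphi \text{ is a literal} \\
\bigwedge_{\substack{\psi \in NLSub(\varphi) \\ \psi = \psi_1 * \psi_2}} (p_\psi \to (p_{\psi_1} * p_{\psi_2})) & \text{otherwise}
\end{cases}
$$

The base case amounts to verifying that for all literals the statement holds.

• If $\varphi$ is a literal, then $Tseitin'(\varphi) = \varphi$ by definition
  and the result follows immediately.

In the inductive step we must verify that for all binary operators (as NNF is
considered there are only $\land$ and $\lor$) the result holds.

• $\varphi = \psi_1 \land \psi_2$:

\begin{align*}
& \alpha(Tseitin'(\varphi)) = \top \\
\iff\ & \alpha(p_\varphi \land conj(\varphi)) = \top \\
\iff\ & \alpha(p_\varphi \land (p_\varphi \to p_{\psi_1} \land p_{\psi_2}) \land conj(\psi_1) \land conj(\psi_2)) = \top \\
\iff\ & \alpha(p_\varphi \land (p_\varphi \to p_{\psi_1} \land p_{\psi_2})) = \top \text{ and } \alpha(conj(\psi_1) \land conj(\psi_2)) = \top \\
\implies\ & \alpha(p_{\psi_1}) = \top \text{ and } \alpha(p_{\psi_2}) = \top \text{ and } \alpha(conj(\psi_1)) = \top \text{ and } \alpha(conj(\psi_2)) = \top \\
\iff\ & \alpha(p_{\psi_1} \land conj(\psi_1)) = \top \text{ and } \alpha(p_{\psi_2} \land conj(\psi_2)) = \top \\
\iff\ & \alpha(Tseitin'(\psi_1)) = \top \text{ and } \alpha(Tseitin'(\psi_2)) = \top \\
\overset{IH}{\implies}\ & \alpha(\psi_1) = \top \text{ and } \alpha(\psi_2) = \top \\
\iff\ & \alpha(\psi_1 \land \psi_2) = \top \\
\iff\ & \alpha(\varphi) = \top
\end{align*}

• $\varphi = \psi_1 \lor \psi_2$:
  similar

□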

Unfortunately the encoding of [2] does not produce a propositional formula
in NNF (the one in [6] does!). Test results show that it is cheaper to use
Tseitin's transformation in its original definition because the
transformation to negation normal form, although linear, is too expensive. But
one refinement to Tseitin's transformation can be made (and is implemented)
which really speeds up the whole process: Consider only non-literal
subformulas instead of non-atomic ones. The transformed formula is then not
necessarily in CNF because atoms may occur with more than one negation.
Removal of those negations is computationally cheaper than considering also
negated atoms as subformulas. Also note that the propositional formula
$B(X,Y)$ already consists of some conjunctions, say
$c_1 \land \dots \land c_n$. Then
$Tseitin(c_1) \land \dots \land Tseitin(c_n)$ is computed a bit faster than
$Tseitin(c_1 \land \dots \land c_n)$. Remember that $n$ is at least the number
of rewrite rules plus function symbols occurring in $B(X,Y)$.

A formulation different from the one in Definition 15 has also been proposed,
in which $\leftrightarrow$ in Definition 14 is likewise replaced by $\to$. In
this version the transformed formula might have more satisfying assignments if
the input formula is satisfiable. It would be sufficient for LPO-termination
if the transformed formula is satisfiable (because also in this transformation
the output formula is satisfiable if and only if the input formula was) but a
satisfying assignment can no longer be used for reading off a valid precedence
because not every satisfying assignment of the output also satisfies the
input.

5 Experimental Results 

In this section the standard LPO implementation of the Tyrolean Termination
Tool (referred to as TTT) is compared with two of the three different atom
based encodings described in [6] using the implementation of [13] (referred to
as BDD2 and BDD3) and a new implementation (referred to as SAT) of the symbol
based encoding due to [2] and described above. Of course the comparison is a
bit unfair since the atom based encodings use BDD techniques to test
satisfiability whereas the symbol based approach interfaces the high-end SAT
solver MiniSat. For the test benches a database of 773 TRSs [H] is considered
and the results of interesting examples are described in more detail. All
tests were performed on cl2-informatik.uibk.ac.at, a server with two Intel®
Xeon™ processors running at a CPU rate of 2.40GHz. The system memory is 512MB
in total. The abbreviations TO (for timeout, 10 seconds) and SO (for stack
overflow) are used. The simplifications of Tables 1 and 2 together with the
most effective variable order wao (cf. [13]) were employed. Times in the
tables are in seconds and include reading the input file, deciding
LPO-termination, and printing a precedence if there exists one.



    TRS                     TTT    BDD2    BDD3    SAT
    Cime_tree               TO     0.044   0.043   0.025
    currying_AG01_3.10      TO     0.056   0.054   0.018
    currying_AG01_3.13      TO     0.083   0.082   0.018
    currying_Ste92_hydra    TO     0.089   0.052   0.015
    HM_t005                 TO     0.447   0.449   0.449
    SK90.4.47               TO     0.032   0.032   0.020
    various_14              TO     0.061   0.061   0.042
    Zantema_z30             TO     0.091   0.091   0.091

Table 5: Problematic instances for TTT.
    TRS                          BDD3   BDD2   TTT    SAT
    AProVE_AAECC-ring            0.10   TO     0.06   0.12
    Cime_quick                   TO     0.07   0.01   0.02
    HM_t009                      TO     TO     0.07   0.13
    Rubio_enno                   0.21   0.06   0.00   0.02
    Rubio_wst99                  0.41   0.93   0.01   0.02
    secret2005_cime3             0.13   6.94   0.02   0.06
    TRCSR_Ex1_2_AEL03_C          0.33   TO     0.12   0.16
    TRCSR_Ex1_2_AEL03_GM         0.11   TO     0.03   0.07
    TRCSR_Ex14_AEGL02_C          TO     TO     0.03   0.03
    TRCSR_Ex14_AEGL02_FR         0.21   0.22   0.01   0.02
    TRCSR_Ex1_GL02a_C            TO     TO     0.03   0.04
    TRCSR_Ex1_GM03_C             0.69   TO     0.03   0.04
    TRCSR_Ex1_Luc02b_C           3.29   1.87   0.02   0.03
    TRCSR_Ex26_Luc03b_C          TO     TO     0.06   0.06
    TRCSR_Ex26_Luc03b_FR         0.90   TO     0.04   0.05
    TRCSR_Ex2_Luc02a_C           6.29   TO     0.06   0.14
    TRCSR_Ex2_Luc03b_C           0.79   TO     0.03   0.08
    TRCSR_Ex3_2_Luc97_C          TO     TO     0.03   0.04
    TRCSR_Ex3_2_Luc97_FR         0.44   TO     0.03   0.04
    TRCSR_Ex3_3_25_Bor03_C       0.71   TO     0.04   0.04
    TRCSR_Ex3_3_25_Bor03_FR      0.11   TO     0.03   0.03
    TRCSR_Ex4_7_37_Bor03_C       3.25   TO     0.06   0.06
    TRCSR_Ex49_GM04_C            0.93   TO     0.05   0.04
    TRCSR_Ex5_7_Luc97_C          0.12   TO     0.09   0.12
    TRCSR_Ex5_7_Luc97_FR         0.12   TO     0.04   0.06
    TRCSR_Ex5_7_Luc97_GM         0.50   TO     0.03   0.07
    TRCSR_Ex5_7_Luc97_Z          0.13   TO     0.04   0.05
    TRCSR_Ex6_15_AEL02_C         8.13   TO     0.18   0.25
    TRCSR_Ex6_15_AEL02_FR        1.00   TO     0.11   0.09
    TRCSR_Ex6_15_AEL02_GM        TO     TO     0.04   0.12
    TRCSR_Ex6_15_AEL02_Z         1.03   TO     0.09   0.08
    TRCSR_Ex7_BLR02_C            TO     TO     0.04   0.05
    TRCSR_Ex8_BLR02_C            TO     TO     0.04   0.04
    TRCSR_Ex9_BLR02_C            0.34   5.82   0.05   0.04
    TRCSR_ExAppendixB_AEL03_C    0.57   TO     0.16   0.18
    TRCSR_ExIntrod_GM01_C        3.14   TO     0.04   0.04
    TRCSR_ExIntrod_GM04_C        6.31   TO     0.02   0.04
    TRCSR_ExIntrod_GM99_C        5.68   TO     0.10   0.08
    TRCSR_ExIntrod_GM99_FR       0.30   TO     0.03   0.03
    TRCSR_ExIntrod_GM99_GM       0.15   TO     0.03   0.06
    TRCSR_ExIntrod_Zan97_C       TO     TO     0.06   0.06
    TRCSR_ExSec11_1_Luc02a_C     TO     TO     0.08   0.08

Table 6: Instances where BDD approach fails.
    TRS                          SAT     BDD2    BDD3    TTT
    AProVE_AAECC-ring            0.117   TO      0.081   0.062
    Cime_mucrl1                  0.296   0.298   0.296   0.932
    HM_t005                      0.449   0.447   0.449   TO
    HM_t009                      0.123   TO      TO      0.062
    TRCSR_Ex1_2_AEL03_C          0.150   TO      0.332   0.121
    TRCSR_Ex5_7_Luc97_C          0.117   TO      TO      0.089
    TRCSR_Ex6_15_AEL02_C         0.233   TO      8.134   0.171
    TRCSR_Ex6_15_AEL02_GM        0.109   TO      TO      0.038
    TRCSR_ExAppendixB_AEL03_C    0.172   TO      0.572   0.151
    Zantema_z30                  0.091   0.091   0.091   TO

Table 7: The 10 hardest problems for the symbol based encoding.



5.1 Comparing the Three Approaches 

In this section the most expensive instances for each approach are discussed. 
Currently there is no implementation for quasi-LPO-termination following the 
atomic encoding idea. Therefore only results concerning strict LPO-termination 
are reported here. A concise comparison for quasi-LPO-termination between 
the symbol based encoding and TTT can be found in [2].

5.1.1 Problematic TTT Instances 

Here the eight instances where TTT could not decide LPO-termination within
a timeout of ten seconds are considered. In Table 5 the execution times of the
alternative approaches are shown. The BDD approach is much better than TTT
(for these instances) and SAT is even faster.

5.1.2 Problematic Instances for the Atom Based Encoding 

In Table 6 the instances which could not be handled by BDD2 or BDD3 (either
because of timeout or stack overflow) are compared with the results of the TTT
and SAT implementations. There are three different types of table entries. Times
for a successful decision of LPO-termination are written in roman font. Italic
numbers indicate the time after which a stack overflow occurred and TO means
that LPO-termination could not be decided within the given timeout. Note
that stack overflows only occurred using BDD3. Whereas both BDD approaches
usually have difficulties with the same instances, these instances seem to cause
no problem for TTT and SAT.

5.1.3 Problematic Instances for the Symbol Based Encoding 

Surprisingly there are no really problematic TRSs for SAT. No timeouts or stack
overflows occurred and the run time results are unequivocal. Table 7 shows the
ten hardest instances for the symbol based approach. Note that although the SAT
approach seems favourable (cf. Section 5.2), BDD3 is equally fast for the two
hardest SAT problems. The reason is that for these two systems almost all
execution time is used for generating the LPO constraints.

method   cpu time   timeouts   stack overflows
BDD3     116        4          31
BDD2     466        36         -
TTT      117        8          -
SAT      8          -          -

Table 8: The three approaches tested on 773 TRSs.

5.2 A Global View 

Until now only individual instances have been considered. We now apply the
different methods to all of the 773 TRSs. Table 8 compares the results of the
BDD approach with TTT and SAT. The format of the table entries is
cpu time / timeouts / stack overflows. So the first entry (execution time)
should be low, and so should the sum of the second and third entries (the cases
where the decision of LPO-termination fails). Comparing the results of both BDD
approaches, the big difference is the 31 stack overflows caused by BDD3. As
explained in [13] these typically occur within the first second when computing
the cycles whereas BDD2 computes minimal paths and minimal cycles until the
timeout is reached. (So 31 x 10 = 310, which is roughly the difference between
the two execution times.) The last two lines list the results for TTT and SAT.
The BDD approach does not perform so well here because a subclass of TRSs in
the database causes problems. This subclass is formed by some of the TRSs
whose names match TRCSR_*. The reason for this is the number of cycles in
the domain graph. Without these instances the BDD approach performs much
better (see [13]). SAT can handle all instances (no timeouts, no stack
overflows) and thus is preferable.

Finally let us compare our SAT implementation with the results of the original
poSAT implementation of [2]. Table 9 presents the 773 instances tested
with SAT and compares the results with the 751 instances tested with poSAT.
Remember that the numbers were produced on different machines and thus it
might be dangerous to compare them directly.





(a) strict LPO

          poSAT    SAT
Total     9.112    7.666
Average   0.012    0.010
Max       0.450    0.449

(b) quasi LPO

          poSAT    SAT
Total     10.428   10.690
Average   0.014    0.014
Max       1.169    0.532

Table 9: 751 instances for poSAT versus 773 instances for SAT.

TRS                      poSAT   SAT
HM_t005                  0.450   0.443
Cime_mucrl1              0.294   0.313
currying_AG01_No_3.13    0.127   0.019

Table 10: Run times for costly to encode instances. 



5.3 General Remarks 

While performing the tests for the atom based encoding two major problems
arose: stack overflows and timeouts. The former are caused by the cycle
computation needed for BDD3. The current algorithm to obtain all cycles can
surely be improved but will still remain computationally expensive. The trouble
with the timeouts splits into two subproblems. Either the constraint formula
B(X,Y) cannot be computed within the allowed time or the timeout occurs while
building the BDD. In the first case no refinement will help but in the second
extra memory may help. The reason is that if the BDD becomes bigger the whole
amount of 512 MB memory is allocated within some seconds and because of
intensive swapping most computations seem to last forever.

Concerning the symbol based approach there will not be many refinements
for further speedup. The authors of [2] do memoization when computing the
LPO constraint C(X,Y) because for some instances the same test $s \succ_{lpo} t$
is performed over and over again. Maybe that is just because their definition of
$\succ_{lpo}$ is a bit less efficient compared to the one presented here. Our
implementation does no memoization and produces more or less equal results.
Table 10 lists three examples where C(X,Y) simplifies to $\top$. That is to say
that almost the whole execution time is spent on computing the LPO encoding of
the instance.

6 Comments on the Paper 

Reference [6] formed the prerequisites of this work. Afterwards the focus shifted
to the alternative encoding presented in [2]. With the first paper as preparation
the second one is easy to understand. The examples given help to grasp the
definitions. Also the algorithm is tested on a large database of TRSs, which
allows one to draw valid conclusions about the results, which are then presented
in detail. So the impression of the paper is rather good but some details should
have been explained more precisely (e.g., Tseitin's transformation, interface
with MiniSat). Therefore re-implementing the algorithm was more work than
anticipated and at first the times obtained in [2] could not be reproduced.
Furthermore it is unclear which transformation to CNF actually was applied.
Either the one in which has the advantage that precedences can be inferred 
and is also the one applied here, or the one in [12] which might be faster but
does not allow one to conclude a valid precedence.
7 Future Work

As addressed in Appendix B the current interface for MiniSat is rather
minimalistic. To accelerate the whole process more C++ data types and maybe also
classes should be interfaced. Also the encoding itself could be optimised a bit.
Equivalences of the form $X_{fg} \leftrightarrow \neg X_{gf}$ as well as
$Y_{fg} \leftrightarrow Y_{gf}$ might reduce the number of propositional
variables in C(X, Y) by a factor of 2 and therefore fewer constraints have to be
added for ordering the function symbols. The alternative transformation to CNF
in [12] might also give some speedup. An extension of the current work to MPO
(Multiset Path Order) might be doable without a big effort: just design an
encoding for the MPO constraints. Although MPO can solve instances which LPO
cannot, it is weaker than LPO on the database: strict/quasi-LPO can prove
128/132 TRSs terminating, strict MPO only 93, and 88 of these instances can be
proved by both. Relating this approach to the dependency pair method [1],
where finding an appropriate argument filtering is one of the main bottlenecks
[4], may be worth consideration. How to efficiently encode the constraints
for that problem needs some further investigation.
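
The variable saving suggested by the two equivalences above can be sketched as
follows. This is only an illustration under assumptions and not part of the
implementation described in this report: function symbols are numbered by
integers, literals follow the DIMACS sign convention of Appendix B.1, and the
class PairVars with its members is a hypothetical name. One propositional
variable is allocated per unordered pair of symbols, so X_gf is read as the
negation of X_fg and Y_gf as Y_fg itself.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>

// Hypothetical helper (not from the report): hands out DIMACS-style
// literals so that each unordered pair {f, g} of symbol indices costs
// only one X variable and one Y variable.
class PairVars {
public:
    // Literal for X_{fg}. X_{gf} is the negated literal, which builds in
    // the assumed equivalence X_{fg} <-> not X_{gf} for f != g.
    int x(int f, int g) {
        assert(f != g);
        return f < g ? var(xs_, f, g) : -var(xs_, g, f);
    }

    // Literal for Y_{fg}. Y_{gf} is the same variable, which builds in
    // the assumed equivalence Y_{fg} <-> Y_{gf}.
    int y(int f, int g) {
        assert(f != g);
        return var(ys_, std::min(f, g), std::max(f, g));
    }

    // Number of propositional variables allocated so far.
    int count() const { return next_ - 1; }

private:
    using Table = std::map<std::pair<int, int>, int>;

    // Look up the variable of the ordered key (lo, hi), allocating a
    // fresh one on first use.
    int var(Table& table, int lo, int hi) {
        auto it = table.find(std::make_pair(lo, hi));
        if (it != table.end()) return it->second;
        int fresh = next_++;
        table.emplace(std::make_pair(lo, hi), fresh);
        return fresh;
    }

    Table xs_, ys_;
    int next_ = 1;  // DIMACS variables start at 1
};
```

For the symbols 1 and 2, x(1, 2) and x(2, 1) return a literal and its negation
while y(1, 2) and y(2, 1) return the same literal, so two variables are
allocated instead of four.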

8 Summary 

In this seminar report the main ideas of a symbol based encoding for
LPO-termination proposed in [2] are explained. Furthermore an implementation in
OCaml has been produced and its run time results are compared to the poSAT
implementation of [2], the standard LPO implementation of TTT and a BDD
implementation which follows a different approach ([6]). The run time results
are unequivocal, i.e., the symbol based encoding is the clear winner.
A LPO is a Simplification Order



In the sequel we show that $\succ_{lpo}$ as defined in Section 2.2 indeed is a
simplification order and therefore a sufficient condition for termination.
First we have to prove some properties of the relation $\sim_{lpo}$.

Lemma 5. $\sim_{lpo}$ is an equivalence relation that is closed under
substitutions.

Proof. As reflexivity, symmetry, and transitivity are obvious we only show
closure under substitutions, i.e., $s \sim_{lpo} t$ implies
$s\sigma \sim_{lpo} t\sigma$ for all terms $s$, $t$ and substitutions $\sigma$.
Assume that $s \sim_{lpo} t$. If $s = t$ then clearly $s\sigma = t\sigma$ and
therefore $s\sigma \sim_{lpo} t\sigma$. For the other case in the definition we
do induction on $\|s\| + \|t\|$. In the base case we have $\|s\| = \|t\| = 0$.
Then $s \sim_{lpo} t$ amounts to $s = t$ and therefore $s\sigma = t\sigma$ which
implies $s\sigma \sim_{lpo} t\sigma$. In the inductive step assume that
$s' \sim_{lpo} t'$ implies $s'\sigma \sim_{lpo} t'\sigma$ for all $s', t'$ with
$\|s'\| + \|t'\| < k$ ($k > 0$) and let $\|s\| + \|t\| = k$. Assuming
$s \sim_{lpo} t$ yields $s = f(s_1, \ldots, s_m)$ and $t = g(t_1, \ldots, t_m)$
with $f \sim g$ and $s_i \sim_{lpo} t_i$ for all $1 \le i \le m$. The induction
hypothesis yields $s_i\sigma \sim_{lpo} t_i\sigma$ for all $1 \le i \le m$ and
thus $s\sigma \sim_{lpo} t\sigma$. □

Lemma 6. The inclusions $\sim_{lpo} \cdot \succ_{lpo} \subseteq \succ_{lpo}$ and
$\succ_{lpo} \cdot \sim_{lpo} \subseteq \succ_{lpo}$ hold.

Proof. We only prove $\sim_{lpo} \cdot \succ_{lpo} \subseteq \succ_{lpo}$, i.e.,
if $s \sim_{lpo} t$ and $t \succ_{lpo} u$ then $s \succ_{lpo} u$, because the
other inclusion is similar. The proof is done by induction on
$\|s\| + \|t\| + \|u\|$. Let $s = f(s_1, \ldots, s_m)$ and
$t = g(t_1, \ldots, t_m)$ with $f \sim g$. In the base case $\|s\| = \|t\| = 1$
and $\|u\| = 0$. $t \succ_{lpo} u$ can only be by $(3, i)$ and as $u = x$ for
some $x \in \mathcal{V}$ we have $t_i = u$. Furthermore $s_i = t_i = u$ yields
$s \succ_{lpo} u$ by $(3, i)$. For the inductive step assume that
$s' \sim_{lpo} t' \succ_{lpo} u'$ implies $s' \succ_{lpo} u'$ for all terms
$s', t', u'$ with $\|s'\| + \|t'\| + \|u'\| < k$ ($k > 2$) and
$\|s\| + \|t\| + \|u\| = k$. Consider the three cases

(1) Let $t \succ_{lpo} u$ by case (1). Then $u = h(u_1, \ldots, u_p)$ with
$g \sim h$. First consider the case where $t_j \sim_{lpo} u_j$ for all
$1 \le j < i$, $t_i \succ_{lpo} u_i$, and $t \succ_{lpo} u_j$ for all
$i < j \le p$. $s \sim_{lpo} t$ implies $s_j \sim_{lpo} t_j$ for all
$1 \le j \le m$ and by transitivity of $\sim_{lpo}$ also $s_j \sim_{lpo} u_j$
for all $1 \le j < i$. $s_i \sim_{lpo} t_i \succ_{lpo} u_i$ implies
$s_i \succ_{lpo} u_i$ by the induction hypothesis and
$s \sim_{lpo} t \succ_{lpo} u_j$ implies $s \succ_{lpo} u_j$ for all
$i < j \le p$ by the induction hypothesis. Therefore $s \succ_{lpo} u$ $(1, i)$.

In the other case we have $n > p$ and $t_i \sim_{lpo} u_i$ for all
$1 \le i \le p$. Since $s \sim_{lpo} t$, $m = n$ and $s_i \sim_{lpo} t_i$ for
all $1 \le i \le n$. Transitivity of $\sim_{lpo}$ yields $s_i \sim_{lpo} u_i$
for all $1 \le i \le p$ which together with $m > p$ proves $s \succ_{lpo} u$ (1).

(2) $t \succ_{lpo} u$ (2). Then $u = h(u_1, \ldots, u_p)$ with $g \succ h$ and
$t \succ_{lpo} u_j$ for all $1 \le j \le p$. $s \succ_{lpo} u_j$ can be obtained
by applying the induction hypothesis to $s \sim_{lpo} t \succ_{lpo} u_j$ for
$1 \le j \le p$ and therefore $s \succ_{lpo} u$ (2) because $f \succ h$.

(3) $t \succ_{lpo} u$ $(3, i)$. Then either $t_i \sim_{lpo} u$ or
$t_i \succ_{lpo} u$. For the former we get $s_i \sim_{lpo} t_i \sim_{lpo} u$ and
by transitivity of $\sim_{lpo}$ also $s_i \sim_{lpo} u$ and thus
$s \succ_{lpo} u$ $(3, i)$. The latter yields
$s_i \sim_{lpo} t_i \succ_{lpo} u$ and thus $s_i \succ_{lpo} u$ by the induction
hypothesis and finally $s \succ_{lpo} u$ $(3, i)$.

□



Next we show that $\succ_{lpo}$ is a simplification order. The proofs are
slightly adapted from [5] where only strict precedences are considered.

Lemma 7. $\succ_{lpo}$ is a rewrite relation, i.e., closed under contexts and
substitutions.

Proof. Closure under contexts: It suffices to show that $s \succ_{lpo} t$
implies $C[s] \succ_{lpo} C[t]$ for all contexts $C$ of the form
$f(u_1, \ldots, \Box, \ldots, u_m)$. Let $\Box$ be the $i$-th argument of $C$.
Assume $s \succ_{lpo} t$. We have to show $C[s] \succ_{lpo} C[t]$.
$C[s] = f(u_1, \ldots, s, \ldots, u_m) \succ_{lpo} f(u_1, \ldots, t, \ldots, u_m)$
$(1, i)$ because $u_j \sim_{lpo} u_j$ (even $u_j = u_j$) for all $1 \le j < i$,
$s \succ_{lpo} t$ by assumption, and $C[s] \succ_{lpo} u_j$ for all
$i < j \le m$.

Closure under substitutions: By induction on $\|s\| + \|t\|$ we show that
$s \succ_{lpo} t$ implies $s\sigma \succ_{lpo} t\sigma$. Assume
$s \succ_{lpo} t$. Therefore let $s = f(s_1, \ldots, s_m)$. In the base case
$\|s\| = 1$ and $\|t\| = 0$. As $t \in \mathcal{V}$ there must be an $i$
($1 \le i \le m$) with $s_i \sim_{lpo} t$ (even $s_i = t$). Then clearly
$s_i\sigma \sim_{lpo} t\sigma$ and therefore $s\sigma \succ_{lpo} t\sigma$ by
$(3, i)$. For the inductive step assume that $s' \succ_{lpo} t'$ implies
$s'\sigma \succ_{lpo} t'\sigma$ for all substitutions $\sigma$ and terms
$s', t'$ with $\|s'\| + \|t'\| < k$ ($k > 1$) and let $\|s\| + \|t\| = k$.
Furthermore $s = f(s_1, \ldots, s_m)$ and $t = g(t_1, \ldots, t_n)$. We
distinguish three cases

• If $s \succ_{lpo} t$ $(1, i)$ or (1) then $f \sim g$. First consider the case
where $s_j \sim_{lpo} t_j$ for all $1 \le j < i$, $s_i \succ_{lpo} t_i$, and
$s \succ_{lpo} t_j$ for all $i < j \le n$. Lemma 5 yields
$s_j\sigma \sim_{lpo} t_j\sigma$ for all $1 \le j < i$ and the induction
hypothesis yields $s_i\sigma \succ_{lpo} t_i\sigma$, and
$s\sigma \succ_{lpo} t_j\sigma$ for all $i < j \le n$. Consequently,
$s\sigma \succ_{lpo} t\sigma$ $(1, i)$. In the other case where $m > n$ and
$s_i \sim_{lpo} t_i$ for all $1 \le i \le n$ Lemma 5 yields
$s_i\sigma \sim_{lpo} t_i\sigma$ for all $1 \le i \le n$ and therefore
$s\sigma \succ_{lpo} t\sigma$ (1).

• If $s \succ_{lpo} t$ (2) then $f \succ g$ and $s \succ_{lpo} t_i$ for all
$1 \le i \le n$. The induction hypothesis yields
$s\sigma \succ_{lpo} t_i\sigma$ for all $1 \le i \le n$ and thus also
$s\sigma \succ_{lpo} t\sigma$ by (2).

• If $s \succ_{lpo} t$ $(3, i)$ then $s_i \sim_{lpo} t$ or $s_i \succ_{lpo} t$.
For the former $s_i\sigma \sim_{lpo} t\sigma$ holds by Lemma 5, for the latter
we have $s_i\sigma \succ_{lpo} t\sigma$ by the induction hypothesis. So in both
cases $s\sigma \succ_{lpo} t\sigma$ by $(3, i)$.

□

Lemma 8. $\succ_{lpo}$ is a proper order, i.e., it is irreflexive and transitive.

Proof. Before proving transitivity and irreflexivity we note that whenever there
are terms $s = f(s_1, \ldots, s_m)$, $t = g(t_1, \ldots, t_n)$ with
$s \succ_{lpo} t$ then $s \succ_{lpo} t_i$ for all $1 \le i \le n$. The easy
proof is the same as for strict precedences and can be found in [5]. Proving
transitivity amounts to showing that $s \succ_{lpo} t$ and $t \succ_{lpo} u$
imply $s \succ_{lpo} u$. Let $s = f(s_1, \ldots, s_m)$ and
$t = g(t_1, \ldots, t_n) \succ_{lpo} u$. We show the result by induction on
$\|s\| + \|t\| + \|u\|$. For the base case we have $\|s\| = \|t\| = 1$,
$\|u\| = 0$. Clearly $t \succ_{lpo} u$ by $(3, i)$ and therefore the desired
$s \succ_{lpo} t_i = u$ follows. For the inductive step assume
$s' \succ_{lpo} t' \succ_{lpo} u'$ implies $s' \succ_{lpo} u'$ for all terms
$s', t', u'$ with $\|s'\| + \|t'\| + \|u'\| < k$ ($k > 2$) and
$\|s\| + \|t\| + \|u\| = k$.

• - Suppose $s \succ_{lpo} t$ $(1, i)$ and $t \succ_{lpo} u$ $(1, j)$. Then
    $u = h(u_1, \ldots, u_p)$ with $f \sim g \sim h$ and therefore $f \sim h$.
    Let $\mu = \min\{i, j\}$. We show that $s \succ_{lpo} u$ $(1, \mu)$ by
    proving (a) $s_l \sim_{lpo} u_l$ for all $1 \le l < \mu$, (b)
    $s_\mu \succ_{lpo} u_\mu$, and (c) $s \succ_{lpo} u_l$ for all
    $\mu < l \le p$.

    (a) Since $l < i, j$ clearly $s_l \sim_{lpo} u_l$.

    (b) The following three cases may appear: If $i = j = \mu$, then
        $s_\mu \succ_{lpo} t_\mu \succ_{lpo} u_\mu$ implies
        $s_\mu \succ_{lpo} u_\mu$ by the induction hypothesis. If
        $i = \mu < j$ or $j = \mu < i$ then
        $s_\mu \succ_{lpo} t_\mu \sim_{lpo} u_\mu$, or
        $s_\mu \sim_{lpo} t_\mu \succ_{lpo} u_\mu$ implies
        $s_\mu \succ_{lpo} u_\mu$ by Lemma 6.

    (c) Since $t \succ_{lpo} u_l$ the desired $s \succ_{lpo} u_l$ for all
        $\mu < l \le p$ follows from the induction hypothesis.

  - Suppose $s \succ_{lpo} t$ $(1, i)$ and $t \succ_{lpo} u$ (1), i.e.,
    $s_j \sim_{lpo} t_j$ for all $1 \le j < i$, $s_i \succ_{lpo} t_i$,
    $s \succ_{lpo} t_j$ for all $i < j \le n$, $n > p$, and
    $t_j \sim_{lpo} u_j$ for all $1 \le j \le p$. If $i > p$ then clearly
    $m > p$. Transitivity of $\sim_{lpo}$ yields $s_j \sim_{lpo} u_j$ for all
    $1 \le j \le p$ and therefore $s \succ_{lpo} u$ (1). If $i \le p$ then
    $s_j \sim_{lpo} u_j$ for all $1 \le j < i$ by transitivity of $\sim_{lpo}$,
    $s_i \succ_{lpo} u_i$, and $s \succ_{lpo} u_j$ for all $i < j \le p$ by the
    induction hypothesis and consequently $s \succ_{lpo} u$ $(1, i)$.

  - The case where $s \succ_{lpo} t$ (1) and $t \succ_{lpo} u$ $(1, j)$ is
    similar to the one above.

  - Suppose $s \succ_{lpo} t$ (1) and $t \succ_{lpo} u$ (1). Then $m > n$,
    $s_i \sim_{lpo} t_i$ for all $1 \le i \le n$, $n > p$, and
    $t_i \sim_{lpo} u_i$ for all $1 \le i \le p$. Transitivity of $\sim_{lpo}$
    yields $s_i \sim_{lpo} u_i$ for all $1 \le i \le p$ and together with
    $m > n > p$ proves $s \succ_{lpo} u$ (1).

• Suppose $s \succ_{lpo} t$ (1) and $t \succ_{lpo} u$ (2). We have
  $u = h(u_1, \ldots, u_p)$ with $f \sim g \succ h$ and $t \succ_{lpo} u_i$ for
  all $1 \le i \le p$. The induction hypothesis yields $s \succ_{lpo} u_i$ for
  all $1 \le i \le p$ and because $f \succ h$ also $s \succ_{lpo} u$ holds by (2).

• If $s \succ_{lpo} t$ (2) and $t \succ_{lpo} u$ (1) or (2) then $f \succ g$ and
  $u = h(u_1, \ldots, u_p)$ with $g \succsim h$. As $f \succ h$ and
  $s \succ_{lpo} u_i$ for all $1 \le i \le p$ by the induction hypothesis also
  $s \succ_{lpo} u$ holds by (2).

• If $s \succ_{lpo} t$ and $t \succ_{lpo} u$ (3) then
  $s \succ_{lpo} t_i \succsim_{lpo} u$ and thus $s \succ_{lpo} u$ either by the
  induction hypothesis or Lemma 6. If $s \succ_{lpo} t$ (3) and
  $t \succ_{lpo} u$ then $s_i \succsim_{lpo} t \succ_{lpo} u$ and again
  $s \succ_{lpo} u$ either by the induction hypothesis or Lemma 6.

$\succ_{lpo}$ is irreflexive: By induction on $\|t\|$ we show that
$t \succ_{lpo} t$ does not hold. In the base case $\|t\| = 0$ and thus $t = x$
for some $x \in \mathcal{V}$. Therefore $t \succ_{lpo} t$ cannot hold. For the
inductive step assume that $t' \succ_{lpo} t'$ does not hold for terms $t'$ with
$\|t'\| < k$ ($k > 0$) and let $\|t\| = k$. For a proof by contradiction assume
that $t \succ_{lpo} t$ holds. If $t \succ_{lpo} t$ $(1, i)$ then
$t_i \succ_{lpo} t_i$ contradicting the induction hypothesis. Also
$t \succ_{lpo} t$ (1) leads to a contradiction because both terms have equally
many arguments. If $t \succ_{lpo} t$ (2) then $f \succ f$ contradicts the
irreflexivity of $\succ$. Finally, if $t \succ_{lpo} t$ (3) then
$t_i \succsim_{lpo} t$. Because $t \succ_{lpo} t_i$ (3), $t_i \succ_{lpo} t_i$
follows either by Lemma 6 or transitivity of $\succ_{lpo}$ and contradicts the
induction hypothesis. □

Lemma 9. $\succ_{lpo}$ has the subterm property.

Proof. As $\rhd_{emb}$ has the subterm property it suffices to show that
$\rhd_{emb} \subseteq \succ_{lpo}$. If $l \to r \in \mathcal{E}mb$ then $r$ is
an argument of $l$ and thus $l \succ_{lpo} r$ (3). Since $\succ_{lpo}$ is a
rewrite order, $\rhd_{emb} \subseteq \succ_{lpo}$. □

Corollary 1. $\succ_{lpo}$ is a simplification order.

B Implementation Details 

The software is mainly written in the functional programming language OCaml
and integrates some already existing modules for parsing, terms, graph theory,
etc. from TTT [10] which has been developed by Nao Hirokawa and Aart
Middeldorp. The non-OCaml part is the state-of-the-art SAT solver MiniSat [3]
which has been employed to test satisfiability of the constraint formula B(X,Y).

B.1 The DIMACS Input Format

Although the general input format for MiniSat is rather simple, a short
description is provided here because many references on the web are too
imprecise and lead directly to some pitfalls.

Example 6. Consider the propositional CNF formula
$p \wedge (\neg q \vee r) \wedge (q \vee \neg r) \wedge q$.
The variables are represented by integer values, e.g., p by 1, q by 2, and r
by 3. Negated variables are encoded by the corresponding negative value, e.g.,
$\neg q$ by -2 and so on. Typically a file starts with some comments and these
lines begin with c. After that a line like p cnf 3 4 indicates the format of
the input file. Up to version 1.13 MiniSat also supported the more general sat
format but the most recent version demands a CNF input (therefore keyword
cnf). The specified number 3 indicates the number of variables (an upper
bound) and 4 reflects the number of conjuncts. At least for the latest version
of MiniSat (v1.14) these two numbers are optional. After that line the
conjuncts follow in their integer encoding. It is important that every conjunct
is trailed by a 0 whereas the newline is optional.

c This is a comment line
c
p cnf 3 4
1 0
-2 3 0
2 -3 0
2 0

Listing 1: A simple SAT instance in the DIMACS format. 
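
The format can also be produced mechanically. The following C++ sketch is an
illustration only: the function name to_dimacs and the type alias Clause are
assumptions and not part of the implementation described here, where the
formula is generated by OCaml code. It writes the header line and closes every
conjunct with the mandatory 0.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// A clause is a list of non-zero integer literals (DIMACS convention):
// a positive number is a variable, a negative number its negation.
using Clause = std::vector<int>;

// Hypothetical helper: render a CNF formula as DIMACS text.
// num_vars is an upper bound on the variable indices that occur.
std::string to_dimacs(int num_vars, const std::vector<Clause>& clauses) {
    std::ostringstream out;
    out << "p cnf " << num_vars << ' ' << clauses.size() << '\n';
    for (const Clause& clause : clauses) {
        for (int lit : clause) {
            // 0 is reserved as the clause terminator.
            assert(lit != 0);
            out << lit << ' ';
        }
        out << "0\n";  // every clause is closed by a trailing 0
    }
    return out.str();
}
```

Called as to_dimacs(3, {{1}, {-2, 3}, {2, -3}, {2}}), the function yields the
four clause lines of Example 6 preceded by the line p cnf 3 4.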
B.2 Interfacing MiniSat with OCaml

Chapter 18 of the OCaml reference manual [7] only explains how to interface
C code with OCaml. SWIG [9] claims that it is able to interface C++ with
OCaml but not even the examples given in the documentation could be compiled
correctly. Anyway, if no complex data structures are shared the following
explanations are sufficient for a working interface. The only differences
compared to [7] are the keyword extern "C" in the declaration of the functions,
which prevents the C++ name mangling mechanism, and the usage of the g++
compiler instead of cc. In the sequel we first describe how to build a shared
C++ library. After that a sample OCaml program gets linked against that
library. We start with the necessary C++ files. As already mentioned it is
important to decorate the declarations of the functions which are intended to be
called from OCaml code with extern "C". The two listings below present C++ code
for the declaration (test.h) as well as the implementation (test.c) of two
functions.

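// extern "C" switches off C++ name mangling for these two functions
extern "C" void test();
extern "C" int add(int a, int b);

Listing 2: test.h

#include "test.h"
#include <iostream>

using namespace std;

// prints a message to show that the C++ side was reached
void test()
{
    cout << "it c++ works" << endl;
} // end test()

// returns the sum of its two arguments
int add(int a, int b)
{
    return a + b;
} // end add()

Listing 3: test.c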

These two files are linked into a dynamic shared library with the two commands

$ g++ -fPIC -g -c -Wall test.c
$ g++ -shared -o dllmylib.so test.o

where -fPIC is the flag for 'position independent code', -g enables debugging
information, and -Wall tells the compiler to show all warnings. The second
command then produces the shared library (flag -shared) named dllmylib.so
from the object file test.o. So producing the C++ library was easy. Let us
now turn to the interface for the functions, which consists of an interface file
(interface.mli) which declares the functions and their corresponding types
and the so-called stub (interface_stubs.c). Concerning the interface file the
keyword external tells the OCaml compiler that these functions are implemented
elsewhere. After that keyword the name of the function in the OCaml
file is specified together with its type. After the = the name of the
corresponding C function is needed. As one can see from the example the function
names in the two implementations might differ but need not.

external otest : unit -> unit = "test"
external add : int -> int -> int = "add"

Listing 4: interface.mli

The stub has to include test.h as well as other OCaml specific header files.

#include "test.h"
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/custom.h>

// wrapper around test(): takes and returns the OCaml unit value
value caml_otest(value unit)
{
    CAMLparam1(unit);
    test();
    CAMLreturn(Val_unit);
}

// wrapper around add(): converts the OCaml integers to C integers,
// calls add() and converts the result back
value caml_add(value a, value b)
{
    CAMLparam2(a, b);
    CAMLreturn(Val_int(add(Int_val(a), Int_val(b))));
}

Listing 5: interface_stubs.c

These two files should be compiled with:

$ ocamlc interface.mli
$ g++ -c -I/usr/local/lib/ocaml interface_stubs.c
Note that the include path of OCaml may vary. With the following main file 

open Interface;;

(* calls both external functions and prints the reported sum *)
let main () =
  otest ();
  let (a, b) = (3, 4) in
  let n = add a b in
  Printf.printf "Sum of %d and %d equals %d\n" a b n;;

main ();;

Listing 6: main.ml

an executable can be compiled with the command

$ ocamlc -o a.out main.ml -dllib -lmylib

Executing a.out should then produce an output similar to

it c++ works
Sum of 3 and 4 equals 8

Listing 7: Example output.

The attentive reader may have noticed that the sum of 3 and 4 actually equals
7 and not 8. In the interface this error occurs because OCaml and C++ have
different representations of integers. As the interface above was constructed
according to the description in [7] the problem probably is not due to this
interface but the OCaml implementation. When an integer from OCaml code is
passed to C++ code one bit is added (namely a 1 at the position of the least
significant bit). Therefore it is not 3 and 4 that are added but 7 and 9. From
the result 16 the least significant bit is chopped off again and therefore the
result 8 is reported.
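
The arithmetic behind the wrong sum can be replayed without the OCaml runtime.
In the sketch below tag and untag are simplified stand-ins for the Val_int and
Int_val macros of <caml/mlvalues.h> (an assumption made for illustration, the
real macros operate on the type value), and add is the function from Listing 3.
One possible reading, which this report does not confirm, is that the external
declarations in Listing 4 name the C++ functions test and add directly rather
than the stubs caml_otest and caml_add of Listing 5, so that add receives the
tagged words and the conversions in the stub are never executed.

```cpp
#include <cassert>

// Simplified stand-ins for Val_int / Int_val. Only non-negative numbers
// are handled so that the shifts below are portable.
inline long tag(long n)   { assert(n >= 0); return (n << 1) + 1; }
inline long untag(long v) { assert(v >= 0); return v >> 1; }

// The C++ function from Listing 3.
inline long add(long a, long b) { return a + b; }

// add is applied to the tagged words and its raw result is read back as
// a tagged word: untag(tag(a) + tag(b)) = a + b + 1.
inline long add_on_tagged_words(long a, long b) {
    return untag(add(tag(a), tag(b)));
}

// What a stub is meant to do: untag both arguments, call add, and tag
// the result before it is handed back to OCaml.
inline long add_via_stub(long a, long b) {
    long result = tag(add(untag(tag(a)), untag(tag(b))));
    return untag(result);  // the number OCaml finally sees
}
```

For a = 3 and b = 4 the tagged words are 7 and 9, add_on_tagged_words yields 8
as in Listing 7, and add_via_stub yields the expected 7.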

Because memory allocation for strings also did not work without trouble, all
data sharing is done via Unix pipes. The file descriptors are redirected to
stdout and stdin respectively. As a consequence up to 30% of the execution time
is needed to hand over the CNF formula to MiniSat. A more thorough interface,
which might take some work, should remove that bottleneck.






